Overview

Dataset statistics

Number of variables: 13
Number of observations: 2968
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 301.6 KiB
Average record size in memory: 104.0 B
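
The memory figures can be cross-checked with a little arithmetic. The sketch below assumes every one of the 13 numeric columns is stored as an 8-byte value (float64 or int64); the report itself does not state the dtypes, so treat that as an assumption.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 13
n_observations = 2968
total_kib = 301.6

# Assumption (not stated in the report): each value occupies 8 bytes.
bytes_per_value = 8

# Bytes needed for one record if all 13 columns are 8-byte numerics.
assumed_record_bytes = n_variables * bytes_per_value

# Size of a single column, in KiB, under the same assumption.
per_column_kib = n_observations * bytes_per_value / 1024

# Average record size implied by the reported (rounded) total.
per_record_bytes = total_kib * 1024 / n_observations

print(f"assumed bytes per record: {assumed_record_bytes}")
print(f"per-column size: {per_column_kib:.2f} KiB")
print(f"implied average record size: {per_record_bytes:.2f} B")
```

Under that assumption, the per-column size agrees with the "Memory size" of 23.2 KiB listed for each variable, and the implied record size agrees with the reported 104.0 B up to the rounding of the total.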

Variable types

NUM: 13

Warnings

High correlation: qtde_items is highly correlated with gross_revenue
High correlation: gross_revenue is highly correlated with qtde_items
Skewed: avg_ticket is highly skewed (γ1 = 25.15706781)
Skewed: frequency is highly skewed (γ1 = 24.87675009)
Skewed: qtde_returns is highly skewed (γ1 = 21.9754032)
Unique: df_index has unique values
Unique: customer_id has unique values
Unique: avg_ticket has unique values
Zeros: recency_days has 33 (1.1%) zeros
Zeros: qtde_returns has 1481 (49.9%) zeros
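
The γ1 values in the skewness warnings are skewness coefficients. As a rough illustration of what such a coefficient measures, the sketch below computes the population moment coefficient of skewness, m3 / m2^1.5, in plain Python. The profiling tool may apply a sample-size bias correction, so this is not claimed to reproduce the exact values above, and the two sample lists are made up for the example.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    # Second and third central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 1, 2, 2, 3, 50]

print(skewness(symmetric))        # zero for a symmetric sample
print(skewness(long_right_tail))  # clearly positive
```

A single very large value is enough to push the coefficient well above zero, which is the pattern the maximum values of avg_ticket, frequency and qtde_returns suggest.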

Reproduction

Analysis started: 2021-06-30 01:51:13.053459
Analysis finished: 2021-06-30 01:51:55.843677
Duration: 42.79 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2316.666442
Minimum: 0
Maximum: 5714
Zeros: 1
Zeros (%): < 0.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 185.35
Q1: 928.5
Median: 2119.5
Q3: 3536.25
95-th percentile: 5034.3
Maximum: 5714
Range: 5714
Interquartile range (IQR): 2607.75

Descriptive statistics

Standard deviation: 1554.722712
Coefficient of variation (CV): 0.6711033938
Kurtosis: -1.010637904
Mean: 2316.666442
Median Absolute Deviation (MAD): 1270.5
Skewness: 0.3426249769
Sum: 6875866
Variance: 2417162.71
Monotonicity: Strictly increasing
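
Several of these figures are derived from the others. A quick consistency check, using the df_index values copied from this report and the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared):

```python
import math

# Values copied from the df_index statistics in this report.
minimum, maximum = 0, 5714
q1, q3 = 928.5, 3536.25
mean, std = 2316.666442, 1554.722712

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of values
cv = std / mean                  # dispersion relative to the mean
variance = std ** 2              # squared standard deviation

print(value_range, iqr)
print(round(cv, 6), round(variance, 2))

# The recomputed values agree with the reported ones up to rounding.
assert value_range == 5714
assert iqr == 2607.75
assert math.isclose(cv, 0.6711033938, rel_tol=1e-6)
assert math.isclose(variance, 2417162.71, rel_tol=1e-6)
```

The same relationships hold for the other variables in this report.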
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
0                    1      < 0.1%
2670                 1      < 0.1%
2658                 1      < 0.1%
4564                 1      < 0.1%
2660                 1      < 0.1%
613                  1      < 0.1%
2662                 1      < 0.1%
615                  1      < 0.1%
4824                 1      < 0.1%
619                  1      < 0.1%
Other values (2958)  2958   99.7%

Minimum 5 values
Value                Count  Frequency (%)
0                    1      < 0.1%
1                    1      < 0.1%
2                    1      < 0.1%
3                    1      < 0.1%
4                    1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
5714                 1      < 0.1%
5695                 1      < 0.1%
5685                 1      < 0.1%
5679                 1      < 0.1%
5658                 1      < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.37702
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12619.35
Q1: 13798.75
Median: 15220.5
Q3: 16768.5
95-th percentile: 17964.65
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2969.75

Descriptive statistics

Standard deviation: 1719.144523
Coefficient of variation (CV): 0.1125803587
Kurtosis: -1.206178196
Mean: 15270.37702
Median Absolute Deviation (MAD): 1489
Skewness: 0.03219371129
Sum: 45322479
Variance: 2955457.892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
16384                1      < 0.1%
18164                1      < 0.1%
12933                1      < 0.1%
12935                1      < 0.1%
14984                1      < 0.1%
17033                1      < 0.1%
13704                1      < 0.1%
12939                1      < 0.1%
17037                1      < 0.1%
14125                1      < 0.1%
Other values (2958)  2958   99.7%

Minimum 5 values
Value                Count  Frequency (%)
12347                1      < 0.1%
12348                1      < 0.1%
12352                1      < 0.1%
12356                1      < 0.1%
12358                1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
18287                1      < 0.1%
18283                1      < 0.1%
18282                1      < 0.1%
18277                1      < 0.1%
18276                1      < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2962
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2693.389373
Minimum: 6.2
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.7325
Q1: 570.845
Median: 1085.51
Q3: 2306.905
95-th percentile: 7169.562
Maximum: 279138.02
Range: 279131.82
Interquartile range (IQR): 1736.06

Descriptive statistics

Standard deviation: 10135.32607
Coefficient of variation (CV): 3.763037818
Kurtosis: 397.3184084
Mean: 2693.389373
Median Absolute Deviation (MAD): 671.39
Skewness: 17.63574461
Sum: 7993979.66
Variance: 102724834.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
745.06               2      0.1%
331                  2      0.1%
734.94               2      0.1%
379.65               2      0.1%
533.33               2      0.1%
731.9                2      0.1%
889.93               1      < 0.1%
471.51               1      < 0.1%
13375.87             1      < 0.1%
284.46               1      < 0.1%
Other values (2952)  2952   99.5%

Minimum 5 values
Value                Count  Frequency (%)
6.2                  1      < 0.1%
13.3                 1      < 0.1%
15                   1      < 0.1%
36.56                1      < 0.1%
45                   1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
279138.02            1      < 0.1%
259657.3             1      < 0.1%
194550.79            1      < 0.1%
140438.72            1      < 0.1%
124564.53            1      < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 272
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.31030997
Minimum: 0
Maximum: 373
Zeros: 33
Zeros (%): 1.1%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
Median: 31
Q3: 81
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 77.76031378
Coefficient of variation (CV): 1.209142264
Kurtosis: 2.77659321
Mean: 64.31030997
Median Absolute Deviation (MAD): 26
Skewness: 1.79807024
Sum: 190873
Variance: 6046.666399
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
1                    99     3.3%
4                    87     2.9%
2                    85     2.9%
3                    85     2.9%
8                    76     2.6%
10                   67     2.3%
7                    66     2.2%
9                    66     2.2%
17                   64     2.2%
22                   55     1.9%
Other values (262)   2218   74.7%

Minimum 5 values
Value                Count  Frequency (%)
0                    33     1.1%
1                    99     3.3%
2                    85     2.9%
3                    85     2.9%
4                    87     2.9%

Maximum 5 values
Value                Count  Frequency (%)
373                  2      0.1%
372                  4      0.1%
371                  1      < 0.1%
368                  1      < 0.1%
366                  4      0.1%

qtde_invoices
Real number (ℝ≥0)

Distinct: 56
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.724056604
Minimum: 1
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 205
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.857882575
Coefficient of variation (CV): 1.5474834
Kurtosis: 190.7771511
Mean: 5.724056604
Median Absolute Deviation (MAD): 2
Skewness: 10.76520644
Sum: 16989
Variance: 78.46208371
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
2                    785    26.4%
3                    498    16.8%
4                    393    13.2%
5                    237    8.0%
1                    190    6.4%
6                    173    5.8%
7                    138    4.6%
8                    98     3.3%
9                    69     2.3%
10                   55     1.9%
Other values (46)    332    11.2%

Minimum 5 values
Value                Count  Frequency (%)
1                    190    6.4%
2                    785    26.4%
3                    498    16.8%
4                    393    13.2%
5                    237    8.0%

Maximum 5 values
Value                Count  Frequency (%)
206                  1      < 0.1%
199                  1      < 0.1%
124                  1      < 0.1%
97                   1      < 0.1%
91                   2      0.1%

qtde_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1664
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1579.712264
Minimum: 1
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 101.35
Q1: 296
Median: 638
Q3: 1398.25
95-th percentile: 4403.25
Maximum: 196844
Range: 196843
Interquartile range (IQR): 1102.25

Descriptive statistics

Standard deviation: 5700.529956
Coefficient of variation (CV): 3.608587516
Kurtosis: 518.1228414
Mean: 1579.712264
Median Absolute Deviation (MAD): 419
Skewness: 18.7602581
Sum: 4688586
Variance: 32496041.78
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
310                  11     0.4%
88                   9      0.3%
150                  9      0.3%
288                  8      0.3%
84                   8      0.3%
272                  8      0.3%
246                  8      0.3%
260                  8      0.3%
114                  7      0.2%
134                  7      0.2%
Other values (1654)  2885   97.2%

Minimum 5 values
Value                Count  Frequency (%)
1                    1      < 0.1%
2                    2      0.1%
12                   2      0.1%
16                   1      < 0.1%
17                   1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
196844               1      < 0.1%
79963                1      < 0.1%
77373                1      < 0.1%
69993                1      < 0.1%
64549                1      < 0.1%

qtde_products
Real number (ℝ≥0)

Distinct: 469
Distinct (%): 15.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.7456199
Minimum: 1
Maximum: 7837
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
Median: 67
Q3: 135
95-th percentile: 382
Maximum: 7837
Range: 7836
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.8785162
Coefficient of variation (CV): 2.198681439
Kurtosis: 354.7550751
Mean: 122.7456199
Median Absolute Deviation (MAD): 44
Skewness: 15.70464041
Sum: 364309
Variance: 72834.41353
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
28                   45     1.5%
20                   38     1.3%
35                   35     1.2%
19                   33     1.1%
15                   33     1.1%
29                   33     1.1%
11                   32     1.1%
26                   31     1.0%
27                   30     1.0%
16                   29     1.0%
Other values (459)   2629   88.6%

Minimum 5 values
Value                Count  Frequency (%)
1                    6      0.2%
2                    14     0.5%
3                    15     0.5%
4                    17     0.6%
5                    26     0.9%

Maximum 5 values
Value                Count  Frequency (%)
7837                 1      < 0.1%
5670                 1      < 0.1%
5095                 1      < 0.1%
4577                 1      < 0.1%
2698                 1      < 0.1%

avg_ticket
Real number (ℝ≥0)

SKEWED
UNIQUE

Distinct: 2968
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.99655282
Minimum: 2.150588235
Maximum: 4453.43
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.915887985
Q1: 13.11811111
Median: 17.96548505
Q3: 24.98179365
95-th percentile: 90.052125
Maximum: 4453.43
Range: 4451.279412
Interquartile range (IQR): 11.86368254

Descriptive statistics

Standard deviation: 119.5318165
Coefficient of variation (CV): 3.622554671
Kurtosis: 812.969606
Mean: 32.99655282
Median Absolute Deviation (MAD): 5.980669355
Skewness: 25.15706781
Sum: 97933.76878
Variance: 14287.85517
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
17.49275862          1      < 0.1%
33.53571429          1      < 0.1%
17.62896175          1      < 0.1%
28.89968794          1      < 0.1%
46.07413043          1      < 0.1%
25.77538462          1      < 0.1%
8.745172414          1      < 0.1%
18.15061538          1      < 0.1%
17.94344444          1      < 0.1%
15.9845              1      < 0.1%
Other values (2958)  2958   99.7%

Minimum 5 values
Value                Count  Frequency (%)
2.150588235          1      < 0.1%
2.4325               1      < 0.1%
2.462371134          1      < 0.1%
2.511241379          1      < 0.1%
2.515333333          1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
4453.43              1      < 0.1%
3202.92              1      < 0.1%
1687.2               1      < 0.1%
952.9875             1      < 0.1%
872.13               1      < 0.1%

avg_recency_days
Real number (ℝ≥0)

Distinct: 1258
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67.30505288
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 25.9271978
Median: 48.26785714
Q3: 85.33333333
95-th percentile: 200.65
Maximum: 366
Range: 365
Interquartile range (IQR): 59.40613553

Descriptive statistics

Standard deviation: 63.50325927
Coefficient of variation (CV): 0.9435139941
Kurtosis: 4.908645262
Mean: 67.30505288
Median Absolute Deviation (MAD): 26.26785714
Skewness: 2.06622239
Sum: 199761.397
Variance: 4032.663938
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
14                   25     0.8%
4                    22     0.7%
70                   21     0.7%
7                    20     0.7%
35                   19     0.6%
49                   18     0.6%
46                   17     0.6%
11                   17     0.6%
21                   17     0.6%
1                    16     0.5%
Other values (1248)  2776   93.5%

Minimum 5 values
Value                Count  Frequency (%)
1                    16     0.5%
1.5                  1      < 0.1%
2                    13     0.4%
2.5                  1      < 0.1%
2.601398601          1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
366                  1      < 0.1%
365                  1      < 0.1%
363                  1      < 0.1%
362                  1      < 0.1%
357                  2      0.1%

frequency
Real number (ℝ≥0)

SKEWED

Distinct: 1225
Distinct (%): 41.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1138262908
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008893504781
Q1: 0.01633986928
Median: 0.02589835169
Q3: 0.04942659085
95-th percentile: 1
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.03308672157

Descriptive statistics

Standard deviation: 0.4082214549
Coefficient of variation (CV): 3.586354717
Kurtosis: 989.0590635
Mean: 0.1138262908
Median Absolute Deviation (MAD): 0.0121968864
Skewness: 24.87675009
Sum: 337.8364311
Variance: 0.1666447562
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
1                    198    6.7%
0.02777777778        17     0.6%
0.0625               17     0.6%
0.02380952381        16     0.5%
0.08333333333        15     0.5%
0.09090909091        15     0.5%
0.03448275862        14     0.5%
0.02941176471        14     0.5%
0.03571428571        13     0.4%
0.02564102564        13     0.4%
Other values (1215)  2636   88.8%

Minimum 5 values
Value                Count  Frequency (%)
0.005449591281       1      < 0.1%
0.005464480874       1      < 0.1%
0.005479452055       1      < 0.1%
0.005494505495       1      < 0.1%
0.005586592179       2      0.1%

Maximum 5 values
Value                Count  Frequency (%)
17                   1      < 0.1%
3                    1      < 0.1%
2                    6      0.2%
1.142857143          1      < 0.1%
1                    198    6.7%

qtde_returns
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 213
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.88847709
Minimum: 0
Maximum: 9014
Zeros: 1481
Zeros (%): 49.9%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 9
95-th percentile: 100
Maximum: 9014
Range: 9014
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 282.864784
Coefficient of variation (CV): 8.107685048
Kurtosis: 596.2019916
Mean: 34.88847709
Median Absolute Deviation (MAD): 1
Skewness: 21.9754032
Sum: 103549
Variance: 80012.48604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
0                    1481   49.9%
1                    164    5.5%
2                    148    5.0%
3                    105    3.5%
4                    89     3.0%
6                    78     2.6%
5                    61     2.1%
12                   51     1.7%
7                    43     1.4%
8                    43     1.4%
Other values (203)   705    23.8%

Minimum 5 values
Value                Count  Frequency (%)
0                    1481   49.9%
1                    164    5.5%
2                    148    5.0%
3                    105    3.5%
4                    89     3.0%

Maximum 5 values
Value                Count  Frequency (%)
9014                 1      < 0.1%
8004                 1      < 0.1%
4427                 1      < 0.1%
3768                 1      < 0.1%
3332                 1      < 0.1%

avg_basket_size
Real number (ℝ≥0)

Distinct: 1972
Distinct (%): 66.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 235.7885065
Minimum: 1
Maximum: 6009.333333
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103.2375
Median: 172
Q3: 281.375
95-th percentile: 598.345
Maximum: 6009.333333
Range: 6008.333333
Interquartile range (IQR): 178.1375

Descriptive statistics

Standard deviation: 283.7237528
Coefficient of variation (CV): 1.203297637
Kurtosis: 103.0742725
Mean: 235.7885065
Median Absolute Deviation (MAD): 82.625
Skewness: 7.717538936
Sum: 699820.2873
Variance: 80499.16789
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
100                  11     0.4%
114                  10     0.3%
86                   9      0.3%
73                   9      0.3%
82                   9      0.3%
75                   8      0.3%
60                   8      0.3%
88                   8      0.3%
136                  8      0.3%
190                  7      0.2%
Other values (1962)  2881   97.1%

Minimum 5 values
Value                Count  Frequency (%)
1                    2      0.1%
2                    1      < 0.1%
3.333333333          1      < 0.1%
5.333333333          1      < 0.1%
5.666666667          1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
6009.333333          1      < 0.1%
4282                 1      < 0.1%
3906                 1      < 0.1%
3868.65              1      < 0.1%
2880                 1      < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

Distinct: 910
Distinct (%): 30.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.49039145
Minimum: 0.2
Maximum: 259
Zeros: 0
Zeros (%): 0.0%
Memory size: 23.2 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 2
Q1: 7.666666667
Median: 13.6
Q3: 22.03571429
95-th percentile: 46
Maximum: 259
Range: 258.8
Interquartile range (IQR): 14.36904762

Descriptive statistics

Standard deviation: 15.4620774
Coefficient of variation (CV): 0.8840326672
Kurtosis: 29.30304319
Mean: 17.49039145
Median Absolute Deviation (MAD): 6.6
Skewness: 3.434441407
Sum: 51911.48183
Variance: 239.0758377
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value                Count  Frequency (%)
13                   43     1.4%
9                    42     1.4%
16                   41     1.4%
8                    39     1.3%
17                   37     1.2%
14                   37     1.2%
11                   36     1.2%
7                    36     1.2%
15                   34     1.1%
5                    34     1.1%
Other values (900)   2589   87.2%

Minimum 5 values
Value                Count  Frequency (%)
0.2                  1      < 0.1%
0.25                 3      0.1%
0.3333333333         6      0.2%
0.4                  1      < 0.1%
0.4090909091         1      < 0.1%

Maximum 5 values
Value                Count  Frequency (%)
259                  1      < 0.1%
177                  1      < 0.1%
148                  1      < 0.1%
127                  1      < 0.1%
105                  1      < 0.1%

Interactions

[169 pairwise interaction plots, one per ordered pair of the 13 variables; figures omitted]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r; only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
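
As a minimal sketch of that definition (not the profiling tool's own implementation), the function below divides the population covariance by the product of the population standard deviations, in plain Python. The sample values are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Note: a constant input has zero standard deviation and would divide by zero.
    return cov / (std_x * std_y)


# Hypothetical data: roughly linear, with a little noise.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.1, 5.9, 8.2, 9.8]

r_original = pearson_r(xs, ys)
# Shifting and positively rescaling either variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], [0.5 * y - 7 for y in ys])

print(r_original, r_rescaled)
```

Using the population (divide by n) or sample (divide by n - 1) convention gives the same r, as long as the same convention is used for the covariance and both standard deviations.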

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
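
A minimal sketch of that recipe, assuming tied values receive the average of the ranks they span (a common convention; the report does not say how ties are handled). It ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. The data are made up.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: perfectly monotonic but nonlinear (cubic).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
print(rho, r)
```

On this cubic example ρ is 1 up to floating-point rounding, while Pearson's r stays below 1 because the relationship is not linear.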

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
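
The formula as worded here is often called tau-a. A minimal sketch in plain Python follows; tie-corrected variants such as tau-b differ when the data contain ties, and this sketch does not claim to match what the profiling tool computes. The sample values are made up.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1  # both variables move in the same direction
        elif direction < 0:
            discordant += 1  # the variables move in opposite directions
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]

tau = kendall_tau_a(xs, ys)
print(tau)  # 5 concordant, 1 discordant, 6 pairs
```

The pairwise loop is quadratic in the number of observations, which is fine for a sketch but slow for large inputs.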

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

[Missing-value plots omitted; the dataset statistics report 0 missing cells]

Sample

First rows

      df_index  customer_id  gross_revenue  recency_days  qtde_invoices  qtde_items  qtde_products  avg_ticket  avg_recency_days  frequency  qtde_returns  avg_basket_size  avg_unique_basket_size
0            0        17850        5391.21        372.00          34.00     1733.00         297.00       18.15             35.50      17.00         40.00            50.97                    0.62
1            1        13047        3232.59         56.00           9.00     1390.00         171.00       18.90             27.25       0.03         35.00           154.44                   11.67
2            2        12583        6705.38          2.00          15.00     5028.00         232.00       28.90             23.19       0.04         50.00           335.20                    7.60
3            3        13748         948.25         95.00           5.00      439.00          28.00       33.87             92.67       0.02          0.00            87.80                    4.80
4            4        15100         876.00        333.00           3.00       80.00           3.00      292.00              8.60       0.07         22.00            26.67                    0.33
5            5        15291        4623.30         25.00          14.00     2102.00         102.00       45.33             23.20       0.04         29.00           150.14                    4.36
6            6        14688        5630.87          7.00          21.00     3621.00         327.00       17.22             18.30       0.06        399.00           172.43                    7.05
7            7        17809        5411.91         16.00          12.00     2057.00          61.00       88.72             35.70       0.03         41.00           171.42                    3.83
8            8        15311       60767.90          0.00          91.00    38194.00        2379.00       25.54              4.14       0.24        474.00           419.71                    6.23
9            9        16098        2005.63         87.00           7.00      613.00          67.00       29.93             47.67       0.02          0.00            87.57                    4.86

Last rows

      df_index  customer_id  gross_revenue  recency_days  qtde_invoices  qtde_items  qtde_products  avg_ticket  avg_recency_days  frequency  qtde_returns  avg_basket_size  avg_unique_basket_size
2958      5626        17727        1060.25         15.00           1.00      645.00          66.00       16.06              6.00       1.00          6.00           645.00                   66.00
2959      5636        17232         421.52          2.00           2.00      203.00          36.00       11.71             12.00       0.15          0.00           101.50                   15.00
2960      5637        17468         137.00         10.00           2.00      116.00           5.00       27.40              4.00       0.40          0.00            58.00                    2.50
2961      5648        13596         697.04          5.00           2.00      406.00         166.00        4.20              7.00       0.25          0.00           203.00                   66.50
2962      5654        14893        1237.85          9.00           2.00      799.00          73.00       16.96              2.00       0.67          0.00           399.50                   36.00
2963      5658        12479         473.20         11.00           1.00      382.00          30.00       15.77              4.00       1.00         34.00           382.00                   30.00
2964      5679        14126         706.13          7.00           3.00      508.00          15.00       47.08              3.00       0.75         50.00           169.33                    4.67
2965      5685        13521        1092.39          1.00           3.00      733.00         435.00        2.51              4.50       0.30          0.00           244.33                  104.00
2966      5695        15060         301.84          8.00           4.00      262.00         120.00        2.52              1.00       2.00          0.00            65.50                   20.00
2967      5714        12558         269.96          7.00           1.00      196.00          11.00       24.54              6.00       1.00        196.00           196.00                   11.00